BIOTECHNOLOGY SYSTEMS
BRANCH



RAW SEQUENCE LISTING 
ERROR REPORT




The Biotechnology Systems Branch of the Scientific and Technical Information 
Center (STIC) detected errors when processing the following computer readable 
form: 

Application Serial Number: 

Source: /C».'> "L. _ — . — 

Date Processed by STIC: O'l^l^ ^20^/ 



THE ATTACHED PRINTOUT EXPLAINS DETECTED ERRORS. 

PLEASE FORWARD THIS INFORMATION TO THE APPLICANT BY EITHER:

1) INCLUDING THIS PRINTOUT IN YOUR NEXT COMMUNICATION TO THE
APPLICANT, WITH A NOTICE TO COMPLY or,
2) TELEPHONING APPLICANT AND FAXING A COPY OF THIS PRINTOUT, WITH A
NOTICE TO COMPLY

FOR CRF SUBMISSION QUESTIONS, PLEASE CONTACT MARK SPENCER, 703-308-4212. 

FOR SEQUENCE RULES INTERPRETATION, PLEASE CONTACT ROBERT WAX, 703-308-4216. 
PATENTIN 2.1 e-mail help: patin21help@uspto.gov or phone 703-306-4119 (R. Wax)
PATENTIN 3.0 e-mail help: patin3help@uspto.gov or phone 703-306-4119 (R. Wax)

TO REDUCE ERRORED SEQUENCE LISTINGS, PLEASE USE THE CHECKER
VERSION 3.0 PROGRAM, ACCESSIBLE THROUGH THE U.S. PATENT AND
TRADEMARK OFFICE WEBSITE. SEE BELOW:



Checker Version 3.0 
The Checker Version 3.0 application is a state-of-the-art Windows-based software program
employing a logical and intuitive user interface to check whether a sequence listing is in
compliance with format and content rules. Checker Version 3.0 works for sequence listings
generated for the original version of 37 CFR §§1.821 - 1.825 effective October 1, 1990 (old
rules) and the revised version (new rules) effective July 1, 1998, as well as World Intellectual
Property Organization (WIPO) Standard ST.25.

Checker Version 3.0 replaces the previous DOS-based version of Checker and is Y2K-compliant.
Checker allows public users to check sequence listings in Computer Readable Form
(CRF) before submitting them to the United States Patent and Trademark Office (USPTO).
Use of Checker prior to filing the sequence listing is expected to result in fewer errored sequence
listings, thus saving time and money.

Checker Version 3.0 can be downloaded from the USPTO website at the following address:

http://www.uspto.gov/web/offices/pac/checker
1652



RAW SEQUENCE LISTING 

PATENT APPLICATION: US/09/486,247 



DATE: 09/24/2001 
TIME: 12:08:14 



Input Set : A:\8484081999.txt 

Output Set: N:\CRF3\09242001\I486247.raw 



5 <110> APPLICANT: DEAR, TERENCE N
7 BOEHM, THOMAS
11 <120> TITLE OF INVENTION: PROTEASE-RELATED PROTEIN
15 <130> FILE REFERENCE: 8484-081-999
19 <140> CURRENT APPLICATION NUMBER: 09/486,247
21 <141> CURRENT FILING DATE: 2000-05-25
23 <150> PRIOR APPLICATION NUMBER: DE 197 36 198.6
24 <151> PRIOR FILING DATE: 1997-08-20
27 <160> NUMBER OF SEQ ID NOS: 8
31 <170> SOFTWARE: PatentIn version 3.1
35 <210> SEQ ID NO: 1
37 <211> LENGTH: 822
39 <212> TYPE: DNA
41 <213> ORGANISM: Artificial Sequence
45 <220> FEATURE:
47 <223> OTHER INFORMATION: Description of Artificial Sequence: Polynucleotide
49 <221> NAME/KEY: CDS
51 <222> LOCATION: (1)..(822)
53 <223> OTHER INFORMATION:
56 <400> SEQUENCE: 1

57 tag gtg gtg tca ttc ccc tcc aac ctg agt gct ggc agg tac act gct   48
58     Val Val Ser Phe Pro Ser Asn Leu Ser Ala Gly Arg Tyr Thr Ala
59     1               5                   10                  15

64 ggc cac cag cag atg ccc atg aag atg ctg aca atg aag atg ctg gcc   96
65 Gly His Gln Gln Met Pro Met Lys Met Leu Thr Met Lys Met Leu Ala
66                 20                  25                  30

68 ctg tgc ttg gtt ctt gct aaa tca gcc tgg tcg gag gaa cag gag aag  144
69 Leu Cys Leu Val Leu Ala Lys Ser Ala Trp Ser Glu Glu Gln Glu Lys
70             35                  40                  45

72 gtg gtt cat gga ggc ccg tgt ttg aag gac tcc cac cct ttc cag gct  192
73 Val Val His Gly Gly Pro Cys Leu Lys Asp Ser His Pro Phe Gln Ala
74         50                  55                  60

76 gcc ctc tac acc tca ggt cac ttg ctg tgt ggt ggg gtc ctc att gac  240
77 Ala Leu Tyr Thr Ser Gly His Leu Leu Cys Gly Gly Val Leu Ile Asp
78     65                  70                  75

80 cca cag tgg gtg ctg aca gct gcc cac tgc aaa aaa ccg aat ctg cag  288
81 Pro Gln Trp Val Leu Thr Ala Ala His Cys Lys Lys Pro Asn Leu Gln
82 80                  85                  90                  95

84 gtg atc ttg ggg aaa cac aac cta cgg caa aca gag act ttc caa agg  336
85 Val Ile Leu Gly Lys His Asn Leu Arg Gln Thr Glu Thr Phe Gln Arg
86                 100                 105                 110

88 caa atc tca gtg gac agg act att gtc cat ccc cgc tac aac cct gaa  384
89 Gln Ile Ser Val Asp Arg Thr Ile Val His Pro Arg Tyr Asn Pro Glu
90             115                 120                 125

92 acc cac gac aat gac atc atg atg gtg cat ctg aaa aat cca gtc aaa  432
93 Thr His Asp Asn Asp Ile Met Met Val His Leu Lys Asn Pro Val Lys
94         130                 135                 140

Does Not Comply
Corrected Diskette Needed
96 ttc tct aaa aag atc cag cct ctg ccc ttg aag aat gac tgc tct gag  480
97 Phe Ser Lys Lys Ile Gln Pro Leu Pro Leu Lys Asn Asp Cys Ser Glu
98     145                 150                 155

100 gag aat ccc aac tgc cag atc ctg ggc tgg ggc aag atg gaa aat ggt  528
101 Glu Asn Pro Asn Cys Gln Ile Leu Gly Trp Gly Lys Met Glu Asn Gly
102 160                 165                 170                 175

104 gac ttc cca gat acc att cag tgt gct gat gtc cat ctg gtg ccc cgg  576
105 Asp Phe Pro Asp Thr Ile Gln Cys Ala Asp Val His Leu Val Pro Arg
106                 180                 185                 190

108 gag cag tgt gag cgt gcc tac cct ggc aag atc acc cag agc atg gtg  624
109 Glu Gln Cys Glu Arg Ala Tyr Pro Gly Lys Ile Thr Gln Ser Met Val
110             195                 200                 205

112 tgc gca ggc gac atg aaa gaa ggc aac gat tcc tgt cag ggt gat tct  672
113 Cys Ala Gly Asp Met Lys Glu Gly Asn Asp Ser Cys Gln Gly Asp Ser
114         210                 215                 220

116 gga ggt ccc cta gta tgt ggg ggt cgc ctc cga ggg ctc gtg tca tgg  720
117 Gly Gly Pro Leu Val Cys Gly Gly Arg Leu Arg Gly Leu Val Ser Trp
118     225                 230                 235

120 ggt gac atg ccc tgt gga tca aag gag aag cca gga gtt tac acc gat  768
121 Gly Asp Met Pro Cys Gly Ser Lys Glu Lys Pro Gly Val Tyr Thr Asp
122 240                 245                 250                 255

126 gtc tgc act cat atc aga tgg atc caa aac atc ctc aga aac aag tgg  816
127 Val Cys Thr His Ile Arg Trp Ile Gln Asn Ile Leu Arg Asn Lys Trp
128                 260                 265                 270

130 ctg tga                                                          822
131 Leu

134 <210> SEQ ID NO: 2
136 <211> LENGTH: 272
138 <212> TYPE: PRT
140 <213> ORGANISM: Artificial Sequence
142 <220> FEATURE:
144 <223> OTHER INFORMATION: Description of Artificial Sequence: Polynucleotide
146 <400> SEQUENCE: 2

148 Val Val Ser Phe Pro Ser Asn Leu Ser Ala Gly Arg Tyr Thr Ala Gly
149 1               5                   10                  15

152 His Gln Gln Met Pro Met Lys Met Leu Thr Met Lys Met Leu Ala Leu
153             20                  25                  30

156 Cys Leu Val Leu Ala Lys Ser Ala Trp Ser Glu Glu Gln Glu Lys Val
157         35                  40                  45

160 Val His Gly Gly Pro Cys Leu Lys Asp Ser His Pro Phe Gln Ala Ala
161     50                  55                  60

164 Leu Tyr Thr Ser Gly His Leu Leu Cys Gly Gly Val Leu Ile Asp Pro
165 65                  70                  75                  80

168 Gln Trp Val Leu Thr Ala Ala His Cys Lys Lys Pro Asn Leu Gln Val
169                 85                  90                  95

172 Ile Leu Gly Lys His Asn Leu Arg Gln Thr Glu Thr Phe Gln Arg Gln
173             100                 105                 110

176 Ile Ser Val Asp Arg Thr Ile Val His Pro Arg Tyr Asn Pro Glu Thr
177         115                 120                 125

180 His Asp Asn Asp Ile Met Met Val His Leu Lys Asn Pro Val Lys Phe
181     130                 135                 140

185 Ser Lys Lys Ile Gln Pro Leu Pro Leu Lys Asn Asp Cys Ser Glu Glu
186 145                 150                 155                 160

189 Asn Pro Asn Cys Gln Ile Leu Gly Trp Gly Lys Met Glu Asn Gly Asp
190                 165                 170                 175

193 Phe Pro Asp Thr Ile Gln Cys Ala Asp Val His Leu Val Pro Arg Glu
194             180                 185                 190

198 Gln Cys Glu Arg Ala Tyr Pro Gly Lys Ile Thr Gln Ser Met Val Cys
199         195                 200                 205

202 Ala Gly Asp Met Lys Glu Gly Asn Asp Ser Cys Gln Gly Asp Ser Gly
203     210                 215                 220

206 Gly Pro Leu Val Cys Gly Gly Arg Leu Arg Gly Leu Val Ser Trp Gly
207 225                 230                 235                 240

210 Asp Met Pro Cys Gly Ser Lys Glu Lys Pro Gly Val Tyr Thr Asp Val
211                 245                 250                 255

214 Cys Thr His Ile Arg Trp Ile Gln Asn Ile Leu Arg Asn Lys Trp Leu
215             260                 265                 270





218 <210> SEQ ID NO: 3 
220 <211> LENGTH: 12 
222 <212> TYPE: DNA 

224 <213> ORGANISM: Artificial Sequence 
226 <220> FEATURE: 

228 <223> OTHER INFORMATION: Description of Artificial Sequence 

230 <400> SEQUENCE: 3 

231 gatctgcggt ga 

234 <210> SEQ ID NO: 4 
236 <211> LENGTH: 24 
238 <212> TYPE: DNA 

240 <213> ORGANISM: Artificial Sequence 
242 <220> FEATURE: 

244 <223> OTHER INFORMATION: Description of Artificial Sequence 

246 <400> SEQUENCE: 4 

247 agcactctcc agcctctcac cgca 
250 <210> SEQ ID NO: 5 

252 <211> LENGTH: 12 
254 <212> TYPE: DNA 

256 <213> ORGANISM: Artificial Sequence 
258 <220> FEATURE: 

260 <223> OTHER INFORMATION: Description of Artificial Sequence 

263 <400> SEQUENCE: 5 

264 gatctgttca tg 

267 <210> SEQ ID NO: 6 
269 <211> LENGTH: 24 
271 <212> TYPE: DNA 

273 <213> ORGANISM: Artificial Sequence 
275 <220> FEATURE: 

277 <223> OTHER INFORMATION: Description of Artificial Sequence 
279 <400> SEQUENCE: 6 






Polynucleotide 

280 accgacgtcg actatccatg aaca 

283 <210> SEQ ID NO: 7 

285 <211> LENGTH: 12 

287 <212> TYPE: DNA 

289 <213> ORGANISM: Artificial Sequence 

291 <220> FEATURE: 

293 <223> OTHER INFORMATION: Description of Artificial Sequence 

295 <400> SEQUENCE: 7 

296 gatcttccct cg
299 <210> SEQ ID NO: 8 
301 <211> LENGTH: 24 
303 <212> TYPE: DNA 

305 <213> ORGANISM: Artificial Sequence 

307 <220> FEATURE:

309 <223> OTHER INFORMATION: Description of Artificial Sequence: 

312 <400> SEQUENCE: 8 

313 aggcaactgt gctatccgag ggaa  24
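
The <211> LENGTH entries can be cross-checked against the residues printed on the <400> lines. The following is a minimal Python sketch of that check, using SEQ ID NOs 3 and 4 as printed in this listing; the function name `count_residues` and the `listing` table are illustrative assumptions, not part of Checker or PatentIn.

```python
import re


def count_residues(sequence_line):
    """Count residue letters on a <400> data line, ignoring spaces and digits."""
    return len(re.sub(r"[^A-Za-z]", "", sequence_line))


# (declared <211> LENGTH, residues as printed in the listing); hypothetical table
listing = {
    3: (12, "gatctgcggt ga"),
    4: (24, "agcactctcc agcctctcac cgca"),
}

for seq_id, (declared, residues) in listing.items():
    actual = count_residues(residues)
    status = "ok" if actual == declared else "MISMATCH"
    print(f"SEQ ID NO {seq_id}: declared {declared}, counted {actual} -> {status}")
```

Digits are stripped before counting so that a running total printed at the right margin does not inflate the result.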
VERIFICATION SUMMARY DATE: 09/24/2001

PATENT APPLICATION: US/09/486,247 TIME: 12:08:15 

Input Set : A:\8484081999.txt 

Output Set: N:\CRF3\09242001\I486247.raw 
Application No.: 09/486,247



NOTICE TO COMPLY WITH REQUIREMENTS FOR PATENT APPLICATIONS CONTAINING 
NUCLEOTIDE SEQUENCE AND/OR AMINO ACID SEQUENCE DISCLOSURES 

Applicant must file the items indicated below within the time period set in the Office action to which the
Notice is attached to avoid abandonment under 35 U.S.C. § 133 (extensions of time may be
obtained under the provisions of 37 CFR 1.136(a)).

The nucleotide and/or amino acid sequence disclosure contained in this application does not comply
with the requirements for such a disclosure as set forth in 37 C.F.R. 1.821 - 1.825 for the following
reason(s):



□ 1. This application clearly fails to comply with the requirements of 37 C.F.R. 1.821-1.825. Applicant's
attention is directed to the final rulemaking notice published at 55 FR 18230 (May 1, 1990), and 1114
OG 29 (May 15, 1990). If the effective filing date is on or after July 1, 1998, see the final rulemaking
notice published at 63 FR 29620 (June 1, 1998) and 1211 OG 82 (June 23, 1998).

□ 2. This application does not contain, as a separate part of the disclosure on paper copy, a "Sequence
Listing" as required by 37 C.F.R. 1.821(c).

□ 3. A copy of the "Sequence Listing" in computer readable form has not been submitted as required by
37 C.F.R. 1.821(e).

☒ 4. A copy of the "Sequence Listing" in computer readable form has been submitted. However, the
content of the computer readable form does not comply with the requirements of 37 C.F.R. 1.822
and/or 1.823, as indicated on the attached copy of the marked-up "Raw Sequence Listing."

□ 5. The computer readable form that has been filed with this application has been found to be damaged
and/or unreadable as indicated on the attached CRF Diskette Problem Report. A substitute
computer readable form must be submitted as required by 37 C.F.R. 1.825(d).

□ 6. The paper copy of the "Sequence Listing" is not the same as the computer readable form of the
"Sequence Listing" as required by 37 C.F.R. 1.821(e).

□ 7. Other:



Applicant Must Provide: 

A substitute computer readable form (CRF) copy of the "Sequence Listing".

A substitute paper copy of the "Sequence Listing", as well as an amendment directing its entry into the
specification.

A statement that the content of the paper and computer readable copies are the same and, where 
applicable, include no new matter, as required by 37 C.F.R. 1.821(e) or 1.821(f) or 1.821(g) or 
1.825(b) or 1.825(d). 

For questions regarding compliance with these requirements, please contact:

For Rules Interpretation, call (703) 308-4216
For CRF Submission Help, call (703) 308-4212
PatentIn Software Program Support

Technical Assistance 703-287-0200

To Purchase PatentIn Software 703-306-2600

PLEASE RETURN A COPY OF THIS NOTICE WITH YOUR REPLY 



Raw Sequence Listing Error Summary 



ERROR DETECTED          SUGGESTED CORRECTION          SERIAL NUMBER:

ATTN: NEW RULES CASES: PLEASE DISREGARD ENGLISH "ALPHA" HEADERS, WHICH WERE INSERTED BY PTO SOFTWARE

1   Wrapped Nucleics / Wrapped Aminos
    The number/text at the end of each line "wrapped" down to the next line. This may occur if your file
    was retrieved in a word processor after creating it. Please adjust your right margin to .3; this will
    prevent "wrapping."

2   Invalid Line Length
    The rules require that a line not exceed 72 characters in length. This includes white spaces.

3   Misaligned Amino Numbering
    The numbering under each 5th amino acid is misaligned. Do not use tab codes between numbers;
    use space characters, instead.

4   Non-ASCII
    The submitted file was not saved in ASCII (DOS) text, as required by the Sequence Rules. Please
    ensure your subsequent submission is saved in ASCII text.

5   Variable Length
    Sequence(s) contain n's or Xaa's representing more than one residue. Per Sequence Rules,
    each n or Xaa can only represent a single residue. Please present the maximum number of each
    residue having variable length and indicate in the <220>-<223> section that some may be missing.

6   PatentIn 2.0 "bug"
    A "bug" in PatentIn version 2.0 has caused the <220>-<223> section to be missing from amino acid
    sequence(s) ____. Normally, PatentIn would automatically generate this section from the
    previously coded nucleic acid sequence. Please manually copy the relevant <220>-<223> section to
    the subsequent amino acid sequence. This applies to the mandatory <220>-<223> sections for
    Artificial or Unknown sequences.

7   Skipped Sequences (OLD RULES)
    Sequence(s) ____ missing. If intentional, please insert the following lines for each skipped sequence:

    (2) INFORMATION FOR SEQ ID NO:X: (insert SEQ ID NO where "X" is shown)
    (i) SEQUENCE CHARACTERISTICS: (Do not insert any subheadings under this heading)
    (xi) SEQUENCE DESCRIPTION:SEQ ID NO:X: (insert SEQ ID NO where "X" is shown)
    This sequence is intentionally skipped

    Please also adjust the "(ii) NUMBER OF SEQUENCES:" response to include the skipped sequences.

8   Skipped Sequences (NEW RULES)
    Sequence(s) ____ missing. If intentional, please insert the following lines for each skipped sequence.

    <210> sequence id number
    <400> sequence id number
    000

9   Use of n's or Xaa's (NEW RULES)
    Use of n's and/or Xaa's have been detected in the Sequence Listing.
    Per 1.823 of Sequence Rules, use of <220>-<223> is MANDATORY if n's or Xaa's are present.
    In <220> to <223> section, please explain location of n or Xaa, and which residue n or Xaa represents.

10  Invalid <213> Response
    Per 1.823 of Sequence Rules, the only valid <213> responses are: Unknown, Artificial Sequence, or
    scientific name (Genus/species). <220>-<223> section is required when <213> response is Unknown or
    is Artificial Sequence.

11  Use of <220>
    Sequence(s) ____ missing the <220> "Feature" and associated numeric identifiers and responses.
    Use of <220> to <223> is MANDATORY if <213> "Organism" response is "Artificial Sequence" or
    "Unknown." Please explain source of genetic material in <220> to <223> section.
    (See "Federal Register," 06/01/1998, Vol. 63, No. 104, pp. 29631-32) (Sec. 1.823 of Sequence Rules)

12  PatentIn 2.0 "bug"
    Please do not use "Copy to Disk" function of PatentIn version 2.0. This causes a corrupted file,
    resulting in missing mandatory numeric identifiers and responses (as indicated on raw sequence
    listing). Instead, please use "File Manager" or any other manual means to copy file to floppy disk.
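
Several of the format errors listed above (line length over 72 characters, tab codes used for spacing, and non-ASCII text) can be screened for before submission. The Python sketch below is only an illustration under that assumption; it is not the Checker program, and the function name `format_problems` is hypothetical.

```python
MAX_LINE_LENGTH = 72  # limit quoted in the "Invalid Line Length" entry


def format_problems(text):
    """Return a list of human-readable format problems found in a listing."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if len(line) > MAX_LINE_LENGTH:
            problems.append(
                f"line {number}: {len(line)} characters (limit {MAX_LINE_LENGTH})"
            )
        if "\t" in line:
            problems.append(f"line {number}: tab character; use spaces instead")
        if not line.isascii():
            problems.append(f"line {number}: non-ASCII character")
    return problems


# Made-up sample: a clean line, a tab-separated numbering line, an overlong line.
sample = "gatctgcggt ga\n1\t5\t10\n" + "a" * 80
for problem in format_problems(sample):
    print(problem)
```

The length test counts white space, as the rule requires, because `len` is applied to the whole line.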



AMC - Biotechnology Systems Branch - 06/04/2001 
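
The rule that a <213> response of "Artificial Sequence" or "Unknown" requires a <220> to <223> section can also be screened for mechanically. The following Python sketch assumes a new-rules listing in which each sequence record starts at its <210> identifier; the function name `missing_feature_sections` and the sample records are hypothetical and are not taken from Checker or PatentIn.

```python
import re

NEEDS_EXPLANATION = {"artificial sequence", "unknown"}


def missing_feature_sections(listing_text):
    """Return SEQ ID NOs whose <213> needs a <220>-<223> section that is absent."""
    flagged = []
    # Split into per-sequence records at each <210> identifier.
    for record in re.split(r"(?=<210>)", listing_text):
        seq_id = re.search(r"<210>\s*(?:SEQ ID NO:)?\s*(\d+)", record)
        organism = re.search(r"<213>\s*(?:ORGANISM:)?\s*(.+)", record)
        if not seq_id or not organism:
            continue  # general information block or incomplete record
        if organism.group(1).strip().lower() in NEEDS_EXPLANATION:
            if "<220>" not in record or "<223>" not in record:
                flagged.append(int(seq_id.group(1)))
    return flagged


# Made-up two-record sample; the second record omits <220> and <223>.
sample = """<210> 1
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence
<400> 1
gatctgcggt ga
<210> 2
<213> Artificial Sequence
<400> 2
gatctgttca tg
"""
print(missing_feature_sections(sample))
```

The optional "SEQ ID NO:" and "ORGANISM:" groups let the same patterns read either the applicant's file or the printout with the English headers added by PTO software.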



